Created: 2026-03-06 07:53:04
Updated: 2026-03-06 07:53:04

3.1 The AEP

The asymptotic equipartition property is formalized in the following theorem:
Theorem 3.1.1: If X1,X2,X_{1},X_{2},\dots are i.i.d p(x)\sim p(x), then

1nlogp(X1,X2,,Xn)H(X)in probability-\frac{1}{n}\log p(X_{1},X_{2},\dots,X_{n})\to H(X) \qquad\text{in probability}

proof: 独立随机变量的函数依然是独立随机变量,因此

1nlogp(X1,X2,,Xn)=1nilogp(Xi)Elogp(X)in probability=H(X)\begin{align} -\frac{1}{n} \log p(X_{1},X_{2},\dots,X_{n}) & =-\frac{1}{n}\sum_{i}\log p(X_{i}) \\ & \to -E\log p(X) & \text{in probability} \\ & =H(X) \end{align}

定义:p(x)p(x)的典型集合(typical set)Aϵ(n)A_{\epsilon}^{(n)}是满足如下条件的序列(x1,x2,xn)Hn(x_{1},x_{2},\dots x_{n})\in\mathscr{H}^{n}

2n(H(X)+ϵ)p(x1,x2,,xn)2n(H(X)ϵ)2^{-n(H(X)+\epsilon)}\leq p(x_{1},x_{2},\dots,x_{n})\leq 2^{-n(H(X)-\epsilon)}

由于AEP的性质,我们可以得出Aϵ(n)A_{\epsilon}^{(n)}的如下性质:

  1. (x1,x2,,xn)Aϵ(n)    H(X)ϵ1nlogp(x1,x2,,xn)H(X)+ϵ(x_{1},x_{2},\dots,x_{n})\in A_{\epsilon}^{(n)}\implies H(X)-\epsilon\leq-\frac{1}{n}\log p(x_{1},x_{2},\dots,x_{n})\leq H(X)+\epsilon
    proof:
    Aϵ(n)A_{\epsilon}^{(n)}的定义可直接得出。

  2. Pr{Aϵ(n)}>1ϵ\text{Pr}\{A_{\epsilon}^{(n)}\}>1-\epsilon for nn sufficiently large
    proof:由Theorem 3.1.1,

    Pr{1nlogp(X1X2Xn)H(X)<ϵ}1in probility\text{Pr}\left\{ \mid-\frac{1}{n}\log p(X_{1}X_{2}\dots X_{n})-H(X)\mid <\epsilon\right\}\to1\qquad\text{in probility}

    因此任意δ\delta,n0,such that nn0\exists n_{0},\text{such that} \forall\ n\geq n_{0}时,我们有

    Pr{1nlogp(X1,X2,,Xn)H(X)<ϵ}>1δ\text{Pr}\left\{ \mid-\frac{1}{n}\log p(X_{1},X_{2},\dots,X_{n})-H(X)\mid <\epsilon\right\}>1-\delta

    δ=ϵ\delta=\epsilon即得证。

  3. Aϵ(n)2n(H(X)+ϵ)\mid A_{\epsilon}^{(n)}\mid\leq 2^{n(H(X)+\epsilon)}

  4. Aϵ(n)(1ϵ)2n(H(X)ϵ)\mid A_{\epsilon}^{(n)}\mid \geq (1-\epsilon)2^{n(H(X)-\epsilon)},对充分大的nn成立。

Theorem 3.2.1: Let XnX^n be i.i.d. p(x)\sim p(x). Let ϵ>0\epsilon>0, then \exists code which maps sequences xnx^n of length nn into binary strings such that the mapping is one-to-one(therefore invertible) and

E[1nl(Xn)]H(X)+ϵE\left[ \frac{1}{n}l(X^n) \right]\leq H(X)+\epsilon

for n sufficiently large.

3.3 High probability sets and the typical set

Let Bδ(n)HnB_{\delta }^{(n)}\subset \mathscr{H}^n be the any set such that Pr{Bδ(n)}1δ\text{Pr}\{B_{\delta}^{(n)}\}\geq 1-\delta. We argue that Bδ(n)B_{\delta}^{(n)} must be significant intersection with Aϵ(n)A^{(n)}_{\epsilon} and therefore must have about as many elements.

Theorem 3.3.1:
Let X1,X2,XnX_{1},X_{2},\dots X_{n} be i.i.d ~p(x)p(x) . For δ<12\delta< \frac{1}{2} and any δ<0\delta'<0, if Pr{Bδn}>1δPr\{B^{n}_{\delta}\}>1-\delta, then

1nlogBδ(n)>Hδfor n sufficiently large\frac{1}{n} \log|B^{(n)}_{\delta}| > H-\delta'\qquad\text{for n sufficiently large}

Leave a Comment

captcha
Fontsize